Skew detection and text line position determination in digitized documents
Authors
Abstract
This paper proposes a computationally efficient procedure for skew detection and text line position determination in digitized documents, based on the cross-correlation between the pixels of vertical lines in a document. Determining the skew angle of a document is essential in optical character recognition systems. Because of the text skew, each horizontal text line intersects a predefined set of vertical lines at nonhorizontal positions. Using only the pixels on these vertical lines, we construct a correlation matrix and evaluate the skew angle of the document with high accuracy. In addition, using the same matrix, we compute the positions of the text lines in the document. The proposed method is tested on a variety of mixed-type documents; it gives accurate results and requires only a short computation time. We illustrate the effectiveness of the algorithm by presenting four characteristic examples.
© 1997 Pattern Recognition Society. Published by Elsevier Science Ltd.
Keywords: Skew detection, Hough transform, Character recognition, Segmentation
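The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not specify how the correlation matrix is built, so the sketch simply cross-correlates pairs of equally spaced pixel columns over a range of vertical shifts and converts the best shift into an angle. The function name `estimate_skew_degrees` and the parameters `spacing` and `max_shift` are assumptions made for illustration only.

```python
import math


def estimate_skew_degrees(image, spacing, max_shift):
    """Estimate skew from a binary image (list of rows, truthy = ink).

    Illustrative assumption, not the published method: columns that are
    `spacing` pixels apart are cross-correlated for every vertical shift
    in [-max_shift, max_shift]; the shift with the highest total
    correlation gives the angle atan(shift / spacing).
    """
    height = len(image)
    width = len(image[0])
    best_shift, best_score = 0, -1
    # Try small shifts first so that ties (e.g. a blank page) resolve to 0.
    for shift in sorted(range(-max_shift, max_shift + 1), key=abs):
        score = 0
        for x in range(0, width - spacing, spacing):
            for y in range(height):
                y2 = y + shift
                # Count rows where ink in the left column lines up with
                # ink in the right column after shifting by `shift`.
                if 0 <= y2 < height and image[y][x] and image[y2][x + spacing]:
                    score += 1
        if score > best_score:
            best_shift, best_score = shift, score
    # Positive angle: text lines descend to the right (image y axis points down).
    return math.degrees(math.atan2(best_shift, spacing))
```

Only the pixels on the sampled columns are read, which is where the low computational cost described in the abstract comes from. In this sketch `max_shift` should stay below half the distance between text lines, otherwise a shift that aligns one line with its neighbor can score as well as the true one.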
Related articles
Modified Self-organizing Maps for Line Extraction in Digitized Text Documents
Different authors have developed modifications of the Kohonen Self-Organizing Maps to solve known combinatorial optimization problems. In this paper, a modification of the Kohonen Map is proposed to detect the white inter-text spaces in digitized plain text documents. The idea relies on the fact that the line extraction problem has several features which match easily with Kohonen net...
Document Decomposition of Bangla Printed Text
skew, Auto rotation. Abstract: Today all kinds of information are being digitized, and along with this digitization, huge archives of various kinds of documents are being digitized too. Optical Character Recognition is the method through which newspapers and other paper documents are converted into digital resources. However, this method works on text only. As a res...
Skew Detection Technique for Various Scripts
This paper describes a technique for detecting the skew introduced during the scanning of documents. It also discusses the tool used to implement the technique. The algorithm has been implemented on various scripts. The method provides a very efficient way to calculate the skew. Correction in the skewed scanned document image is very impor...
Resolution Independent Skew and Orientation Detection for document images
In large scale scanning applications, orientation detection of the digitized page is necessary for the following procedures to work correctly. Several existing methods for orientation detection use the fact that in Roman script text, ascenders are more likely to occur than descenders. In this paper, we propose a different approach for page orientation detection that uses this information. The m...
Local Skew Correction in Documents
In this paper we propose a technique for detecting and correcting the skew of text areas in a document. The documents we work with may contain several areas of text with different skew angles. First, a text localization procedure is applied based on connected components analysis. Specifically, the connected components of the document are extracted and filtered according to their size and geomet...
Journal title:
- Pattern Recognition
Volume 30, Issue
Pages -
Publication date: 1997